Fast Approximate Score Computation on Large-Scale Distributed Data for Learning Multinomial Bayesian Networks

Related resources

Scalable Score Computation for Learning Multinomial Bayesian Networks over Distributed Data

In this paper, we focus on the problem of learning a Bayesian network over distributed data stored in a commodity cluster. Specifically, we address the challenge of computing the scoring function over distributed data in a scalable manner, which is a fundamental task during learning. We propose a novel approach designed to achieve: (a) scalable score computation using the principle of gossiping...

Approximate Structure Learning for Large Bayesian Networks

We present approximate structure learning algorithms for Bayesian networks. We discuss on the two main phases of the task: the preparation of the cache of the scores and structure optimization, both with bounded and unbounded treewidth. We improve on state-of-the-art methods that rely on an ordering-based search by sampling more effectively the space of the orders. This allows for a remarkable i...

Learning Hierarchical Bayesian Networks for Large-Scale Data Analysis

Bayesian network learning is a useful tool for exploratory data analysis. However, applying Bayesian networks to the analysis of large-scale data, consisting of thousands of attributes, is not straightforward because of the heavy computational burden in learning and visualization. In this paper, we propose a novel method for large-scale data analysis based on hierarchical compression of informa...

Learning Summary Statistics for Approximate Bayesian Computation

In high dimensional data, it is often very difficult to analytically evaluate the likelihood function, and thus hard to get a Bayesian posterior estimation. Approximate Bayesian Computation is an important algorithm in this application. However, to apply the algorithm, we need to compress the data into low dimensional summary statistics, which is typically hard to get in an analytical form. In ...

Approximate Bayesian Computation for Distance-Dependent Learning

The distance dependent Chinese restaurant process (ddCRP) and its hierarchical extensions provide a flexible framework for clustering data with temporal, spatial, or other non-exchangeable dependencies. The successful application of these models crucially depends on functions chosen to encode structural dependencies exhibited by the data. Designing such affinity functions is challenging and oft...

Journal

Journal title: ACM Transactions on Knowledge Discovery from Data

Year: 2019

ISSN: 1556-4681,1556-472X

DOI: 10.1145/3301304